{
 "nbformat": 4,
 "nbformat_minor": 0,
 "metadata": {
  "colab": {
   "provenance": []
  },
  "kernelspec": {
   "name": "python3",
   "display_name": "Python 3"
  },
  "language_info": {
   "name": "python"
  }
 },
 "cells": [
  {
   "cell_type": "markdown",
   "source": "# Running RAG Completion with Nebius LLM and Embedding Models\n\nThis notebook demonstrates how to build a **Retrieval-Augmented Generation (RAG)** system using Nebius Token Factory. Nebius Token Factory provides access to a variety of state-of-the-art LLM models. You can check out the full list of available models [here](https://studio.nebius.ai/).\n\nVisit [Nebius Token Factory](https://studio.nebius.ai/) and sign up to obtain an API key.",
   "metadata": {
    "id": "_NV3KDMl7CEJ"
   }
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Installation of Required Libraries"
   ],
   "metadata": {
    "id": "RCP37QIU4mGx"
   }
  },
  {
   "cell_type": "code",
   "source": [
    "%pip install llama-index-llms-nebius llama-index-embeddings-nebius"
   ],
   "metadata": {
    "id": "Mu76Wuu5d02E"
   },
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "id": "B2F6b52C6xzu"
   },
   "outputs": [],
   "source": [
    "!pip install -U llama-index"
   ]
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Setting Up Environment Variables\n",
    "\n"
   ],
   "metadata": {
    "id": "ziPtLOA-d4nL"
   }
  },
  {
   "cell_type": "code",
   "source": [
    "# set api key in env or in llm\n",
    "import os\n",
    "os.environ[\"NEBIUS_API_KEY\"] = \"your api key\"\n"
   ],
   "metadata": {
    "id": "U9kITjzF7Cvo"
   },
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Importing Required Modules\n",
    "\n",
    "We will import the necessary modules from llama-index to work with Nebius LLM and embeddings."
   ],
   "metadata": {
    "id": "6TwDSwui5EBG"
   }
  },
  {
   "cell_type": "code",
   "source": [
    "from llama_index.core import SimpleDirectoryReader,Settings, VectorStoreIndex\n",
    "from llama_index.embeddings.nebius import NebiusEmbedding\n",
    "from llama_index.llms.nebius import NebiusLLM"
   ],
   "metadata": {
    "id": "K0JlJ1uz5GK7"
   },
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Defining a Function for RAG Completion\n",
    "\n",
    "This function initializes the Nebius LLM and embedding models, loads documents, creates an index, and retrieves relevant information based on the query.\n",
    "\n",
    "Runs retrieval-augmented generation (RAG) using Nebius models.\n",
    "    \n",
    "Parameters:\n",
    "  - document_dir (str): Path to the directory containing documents.\n",
    "  - query_text (str): The user query for which relevant information needs to be retrieved.\n",
    "  - embedding_model (str): The embedding model to use.\n",
    "  - generative_model (str): The generative model to use.\n",
    "    \n",
    "Returns:\n",
    "  - str: The generated response based on retrieved documents."
   ],
   "metadata": {
    "id": "-G1l8TaH5OII"
   }
  },
  {
   "cell_type": "code",
   "source": [
    "# Provide a template following the LLM's original chat template.\n",
    "def completion_to_prompt(completion: str) -> str:\n",
    "  return f\"<s>[INST] {completion} [/INST] </s>\\n\"\n",
    "\n",
    "\n",
    "def run_rag_completion(\n",
    "    document_dir: str,\n",
    "    query_text: str,\n",
    "    embedding_model: str =\"BAAI/bge-en-icl\",\n",
    "    generative_model: str =\"deepseek-ai/DeepSeek-V3\"\n",
    "    ) -> str:\n",
    "\n",
    "    llm = NebiusLLM(\n",
    "    model=generative_model,\n",
    "    api_key=os.getenv(\"NEBIUS_API_KEY\")\n",
    "    )\n",
    "\n",
    "    embed_model = NebiusEmbedding(\n",
    "        model_name=embedding_model,\n",
    "        api_key=os.getenv(\"NEBIUS_API_KEY\")\n",
    "    )\n",
    "    Settings.llm = llm\n",
    "    Settings.embed_model = embed_model\n",
    "    documents = SimpleDirectoryReader(document_dir).load_data()\n",
    "    index = VectorStoreIndex.from_documents(documents)\n",
    "    response = index.as_query_engine(similarity_top_k=5).query(query_text)\n",
    "\n",
    "    return str(response)"
   ],
   "metadata": {
    "id": "3W8fOXsV7kFE"
   },
   "execution_count": null,
   "outputs": []
  },
  {
   "cell_type": "markdown",
   "source": [
    "## Running the RAG Completion Process\n",
    "\n",
    "We specify the document directory and the query text, then execute the `run_rag_completion` function"
   ],
   "metadata": {
    "id": "naiccHbe5kD0"
   }
  },
  {
   "cell_type": "code",
   "source": [
    "query_text = \"Give me all the details of the invoice in short\"\n",
    "document_dir = \"./data\"\n",
    "\n",
    "response = run_rag_completion(document_dir, query_text)\n",
    "print(response)"
   ],
   "metadata": {
    "id": "71aJVMG8tNmj"
   },
   "execution_count": null,
   "outputs": []
  }
 ]
}